Doubly elastic net regularized online portfolio optimization with transaction costs

Online portfolio optimization with transaction costs is a major challenge for the large-scale intelligent computing community, because the rapidly changing market leaves few samples to learn from and varying transaction costs add complexity. In this paper, we address this problem with a machine learning approach. Specifically, we reformulate it as a minimization over the simplex whose objective contains three terms: the negative expected return, the elastic net regularization of the term that controls transaction costs, and the elastic net regularization of the portfolio variable. We apply the linearized augmented Lagrangian method (LALM) and the alternating direction method of multipliers (ADMM) to solve the model efficiently, prove their convergence, and derive closed-form solutions of the subproblems in each iteration. Furthermore, extensive experiments on five benchmark datasets from real markets show that the proposed algorithms outperform the compared state-of-the-art strategies in most cases across six evaluation dimensions.

www.nature.com/scientificreports/
risk 24,25, and the explicit is from the transaction. In particular, we focus only on explicit transaction costs, such as taxes 26 and buying and selling fees, since they weigh more heavily on retail investors, who are the target of our study of online portfolio optimization.
The first explicit extension was given by Bauer et al. 27, who showed that Cover's Universal Portfolio 9 remains applicable when transaction costs are imposed. However, they did not take transaction costs into the decision process. Albeverio et al. 28 proposed a new transaction cost optimization model that reformulates transaction costs as the distance between portfolios. Managing transaction costs via machine learning is also common. Györfi et al. 29 augmented the original Markowitz objective function with a penalty term proportional to the sum of the absolute values of the portfolio weights, which encourages sparse portfolios and allows transaction costs to be considered. Das et al. 30 applied machine learning to the transaction cost problem and solved it with the GP algorithm. Furthermore, Li et al. 31 solved the portfolio selection problem with transaction costs by proposing the TCO framework, which yields closed-form formulae for the portfolio update, and they also established the relationship between transaction costs and portfolios.
Although online portfolio optimization has been studied for decades, models that simultaneously consider transaction costs, sparsity, and correlations among portfolio variables are lacking. We therefore aim to design a mathematical model that accounts for transaction costs as well as regularization of portfolios, in order to enhance the cumulative net wealth and the generalization ability of the model. Noting that elastic net regularization performs automatic variable selection while keeping groups of related variables, we propose a doubly elastic net regularized model for portfolio selection problems. We then apply the linearized augmented Lagrangian method (LALM) and the alternating direction method of multipliers (ADMM) to solve the proposed model. It is worth mentioning that guaranteeing that a solving algorithm converges to a solution of the corresponding model is difficult. A number of studies 18,19,31 did not prove the convergence of their algorithms theoretically, whereas we do. Numerical experiments show the efficiency of the proposed algorithms. The innovations and main contributions of this paper are as follows:
• We propose a minimization problem over the simplex that accounts for transaction costs and regularization of portfolios simultaneously. The objective contains three terms: the negative expected return; the elastic net regularization of the difference between the portfolios of the next and last periods, which controls the transaction cost; and the elastic net regularization of the portfolio vector (which reduces to the squared L2 norm because of the simplex constraint), which improves the generalization of the model. Thanks to the properties of the elastic net, the proposed model considers the transaction cost, sparsity, and the correlation between variables simultaneously.
• To solve the proposed model, we apply the LALM and show that the sequence generated by the algorithm converges to a solution of the model. Further, we establish the closed-form solution of the subproblem in each iteration, which makes the algorithm computationally efficient. We point out that LALM does not need to project any vector onto the simplex in each iteration, which may save computational time in practice.
• We further apply ADMM to solve the proposed model by appropriately splitting the variable into two variables. The convergence of ADMM is established by proving the existence of a saddle point of the corresponding augmented Lagrangian function. Unlike LALM, each update of ADMM is restricted to the simplex, which may improve the accuracy of the algorithm.
• We compare the proposed algorithms with state-of-the-art methods for portfolio selection on four benchmark datasets. Numerical experiments illustrate that the proposed algorithms perform better than the compared methods in most cases.
This paper is organized as follows. We present some preliminaries and related works in section "Preliminaries". The whole portfolio selection system and its solving algorithms are illustrated in section "Portfolio optimization". Section "Experiment" focuses on the experimental results to evaluate the efficiency of our proposed algorithms, and section "Conclusion" summarizes the paper.

Preliminaries

Problem setting
In the real market, transaction costs cannot be ignored, especially in short-term investment. Assume that there are $m$ assets invested over $T$ periods in a financial market. The relative prices are collected as a vector $x_t = (x_{t,1}, \ldots, x_{t,i}, \ldots, x_{t,m})$, $t = 1, 2, \ldots, T$, where $x_{t,i} = P_{t,i}/P_{t-1,i}$ denotes the relative price and $P_{t,i}$ is the closing price of the $i$th asset in the $t$th period. A portfolio vector $b_t$ is assumed to be self-financed, with no margin and no short selling, that is, $b_t \in \Delta_m = \{b \in \mathbb{R}^m_+ : \mathbf{1}^{\top} b = 1\}$, where $b_{t,i}$ indicates the proportion of total wealth invested in the $i$th asset in the $t$th period.
At the beginning of the $t$th period the portfolio is set to $b_t$. Because the market fluctuates and the portfolio is not rebalanced during the $t$th period, the wealth allocation at the end of the period becomes $\hat{b}_t = \frac{b_t \odot x_t}{\langle b_t, x_t \rangle}$, where $\langle \cdot, \cdot \rangle$ is the inner product and $\odot$ is the element-wise product. Denoting the transaction cost rate by $\gamma$, previous studies 27-29,31 related $\gamma$ to wealth through $w_{t-1}$, the net proportion of wealth remaining after transaction costs, expressed with the $p$-norm $\|\cdot\|_p$. Furthermore, Li et al. 31 pointed out that the final cumulative wealth $S_T$ should be updated to incorporate transaction costs, where $S_0$ is always normalized to 1. $S_T$ is also called the net cumulative wealth, which is more meaningful for computation and comparison.
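As a small illustration of the drift described above, the following Python sketch computes the end-of-period allocation $\hat{b}_t$ from $b_t$ and $x_t$. The function name and the numbers are made up for the example; only the formula comes from the text.

```python
import numpy as np

def drifted_portfolio(b_t, x_t):
    """End-of-period allocation (b_t ⊙ x_t) / <b_t, x_t> without rebalancing."""
    b_t = np.asarray(b_t, dtype=float)
    x_t = np.asarray(x_t, dtype=float)
    grown = b_t * x_t              # element-wise product b_t ⊙ x_t
    return grown / grown.sum()     # grown.sum() equals the inner product <b_t, x_t>

# Hypothetical three-asset example.
b_t = np.array([0.5, 0.3, 0.2])
x_t = np.array([1.10, 0.95, 1.00])
b_hat = drifted_portfolio(b_t, x_t)
```

Because the assets grow at different rates, the weights move away from $b_t$ even without trading, and the next rebalancing starts from this drifted vector rather than from $b_t$.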

Benchmark systems
Uniformly buy-and-hold (UBAH), Beststock (BEST) and Best Constant Rebalanced Portfolios (BCRP) are three benchmark portfolio strategies. UBAH is a simple but widely used strategy that invests evenly at the beginning and holds the position for the whole period: $b_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)$. Thus, the cumulative wealth of UBAH is $S_T = b_1^{\top}\left(\bigodot_{t=1}^{T} x_t\right)$, with the product over periods taken element-wise.

Related works on transaction costs
Györfi et al. 29 extended the Markowitz portfolio framework by adding a penalty term so that proportional transaction costs can be considered. The model is widely adopted and has attracted considerable attention in the area of controlling transaction costs through a penalty term in the portfolio selection problem. They denoted the transaction-cost-related factor as the ratio of net wealth after rebalancing to wealth before rebalancing, that is, $w_{t-1} = N_{t-1}/S_{t-1}$. The rates of proportional transaction costs for sales and purchases were denoted by $0 < c_s < 1$ and $0 < c_b < 1$, respectively. They proposed a recursive portfolio strategy based on $F_{\delta}(b, X)$, the recursive function formulated by the discounted Bellman equation, where $\delta_t$ is a discount factor such that $\delta_t \to 0$, $X_t$ is a homogeneous first-order Markov process, and $v(b, b', x)$ is the inner function defined in 29.
Li et al. 31 considered another proportional transaction cost model named Transaction Cost Optimization (TCO), a sparse portfolio selection model with a norm penalty. They took transaction costs as a regularization term of the portfolio model through an L1 norm penalty and obtained a closed-form solution of the portfolio update by the proximal gradient descent method. Li et al. theoretically showed that the transaction cost is related to $\|b - \hat{b}_t\|_1$ by proving a bound indicating that the net proportion is inversely related to $\|b - \hat{b}_t\|_1$. They therefore proposed a model in which $\tilde{x}_{t+1}$ denotes the predicted price relative vector and $\lambda$ is a regularization parameter. Its closed-form solution involves an inner variable $\eta_t$, the positive part $[v]_+ = \max(0, v)$, and $\operatorname{sign}(v)$, the sign of $v$.

Price information
We should consider a price forecasting method to follow a data-driven approach, which lessens the influence of irrational factors 32-34 in the market. Specifically, we formulate the expected return with a price forecasting method based on historical information. PAMR 35 and CWMR 12 assumed that the predicted relative price in the next period is the inverse of that in the current period, $\tilde{x}_{t+1} = 1/x_t$, exploiting single-period mean reversion to balance risk and return. Besides, OLMAR 3 exploited multi-period mean reversion through a moving average. It proposed that the price in the next period will revert to the moving average over a window of size $w$, which smooths the price volatility in the online portfolio problem.
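The two mean-reversion predictions just described can be sketched in Python as follows. The single-period rule is the one stated in the text. The moving-average rule is written here in a commonly used form, the simple average of the last $w$ prices divided by the latest price, which is an assumption of this sketch rather than a formula taken from this paper. Function names and data are hypothetical.

```python
import numpy as np

def single_period_reversion(x_t):
    """PAMR/CWMR-style prediction: next price relative is 1 / x_t."""
    return 1.0 / np.asarray(x_t, dtype=float)

def moving_average_reversion(prices, w):
    """OLMAR-style prediction (assumed form): mean of the last w prices
    divided by the most recent price, per asset."""
    prices = np.asarray(prices, dtype=float)   # shape (periods, assets)
    window = prices[-w:]
    return window.mean(axis=0) / prices[-1]

# Hypothetical closing prices of two assets over three periods.
prices = np.array([[10.0, 20.0],
                   [11.0, 18.0],
                   [12.0, 19.0]])
x_last = prices[-1] / prices[-2]
pred_single = single_period_reversion(x_last)
pred_ma = moving_average_reversion(prices, w=3)
```

A predicted relative below 1 for the first asset reflects the expectation that a price above its recent average will fall back toward it.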
Moreover, the above mean reversion strategies may be sub-optimal under the noise of the real market, because real market data are not normally distributed 36. The robust median reversion (RMR) 37 strategy, which uses the L1-median estimator 38,39 and online machine learning, is robust to the real market and can withstand nontrivial transaction costs. RMR can handle the long-tailed distribution of the real market.
On the other hand, most investors follow the trend and keep purchasing rising stocks, so they regard $P_{MAX}$ as a potential level that the future price can probably reach. Consequently, a generalized logarithmic return (GLR) 22 was proposed to predict the relative price in the next period. To illustrate the adaptability of the optimization model to various price processes and to predict the relative price comprehensively, we adopt the above methods in this paper to further study online portfolio optimization with transaction costs.
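To make the role of the L1-median estimator concrete, here is a minimal Python sketch. It computes the L1-median of a price window by Weiszfeld iterations and, as an assumption of this example, forms the prediction as the L1-median divided by the latest price. The iteration scheme, function names and data are illustrative choices, not details confirmed by this paper.

```python
import numpy as np

def l1_median(points, max_iter=200, tol=1e-9):
    """Weiszfeld iterations for the point minimizing the sum of
    Euclidean distances to the rows of `points`."""
    points = np.asarray(points, dtype=float)   # shape (w, m)
    mu = points.mean(axis=0)                   # start from the mean
    for _ in range(max_iter):
        dist = np.linalg.norm(points - mu, axis=1)
        dist = np.maximum(dist, 1e-12)         # guard against division by zero
        weights = 1.0 / dist
        new_mu = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

def median_reversion_prediction(prices, w):
    """Assumed RMR-style prediction: L1-median of last w prices / latest price."""
    prices = np.asarray(prices, dtype=float)
    return l1_median(prices[-w:]) / prices[-1]

# Hypothetical price window of two assets.
prices = np.array([[10.0, 20.0],
                   [12.0, 20.0],
                   [11.0, 21.0],
                   [11.0, 19.0]])
center = l1_median(prices)
pred = median_reversion_prediction(prices, w=4)
```

Unlike the mean, the L1-median moves little when a single observation in the window is replaced by an outlier, which is the robustness property the text refers to.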

The proposed doubly elastic net regularized online portfolio optimization with transaction costs
Considering the goals of maximizing cumulative wealth and minimizing transaction costs, we build model (13), a minimization over the simplex $\Delta_m$ of the sum of three terms: the negative expected return $-f^{\top} b$, the elastic net regularization of $b - \hat{b}_t$, and the elastic net regularization of $b$. Here $\lambda > 0$ and $f$ is the predicted relative price in the next period, for which we consider four cases corresponding to the predictions $\tilde{x}_{t+1}$ of the methods in the section "Price information". Since model (13) involves elastic net regularization terms for both $b - \hat{b}_t$ and $b$, which will be explained below, we call it the doubly elastic net regularized portfolio optimization (DENRPO) model. The term $f^{\top} b$ represents the predicted wealth growth factor and thus the expected return potential of the whole portfolio. The goal of maximizing cumulative wealth is therefore expressed through this term, and taking the negative expected return turns the maximization into a minimization.
Further, we design the proposed model to manage transaction costs with better generalization. On the one hand, transaction costs are reflected in the wealth growth through the net asset proportion. Inspired by the finding that transaction costs influence the net proportion through the distance between $b_t$ and $\hat{b}_{t-1}$, we apply the elastic net to this term for the following reasons. First, the elastic net tends to preserve highly correlated variables through its L2 norm structure while maintaining sparsity through the L1 norm, so it avoids extreme positions and improves diversification and stability. Second, regression analysis shows that the elastic net is particularly useful when the number of assets is large, since it copes with limited samples. These considerations lead us to minimize the elastic net of $b - \hat{b}_t$ to trade off transaction costs, in which $\lambda$ and $\eta$ are the regularization parameters controlling transaction costs. The smaller $\lambda$ and $\eta$ are, the weaker the regularization is and the more the model is inclined to pursue high returns. When $\lambda$ and $\eta$ are larger, these regularization terms pull $b_{t+1}$ toward $\hat{b}_t$, that is, the number of rebalanced assets is reduced, which lessens the transaction cost.
On the other hand, research shows that regularizing the portfolio variable through $\|b\|$ is meaningful in portfolio optimization. Fan et al. 40 showed that an L1 norm penalty on the portfolio variable is equivalent to constraining the cumulative statistical estimation error of the risk or utility, since constraints on individual assets limit total exposure, thus controlling risk approximation errors and bringing the empirical risk closer to the actual risk. In addition, Brodie et al. 20 proposed that portfolio weights can represent transaction costs. Furthermore, Li 41 proposed that penalizing portfolio weights favors the sparsity and stability of the portfolio, since shifting and scaling the portfolio weights derived from the sample estimates towards zero allows small portfolio weights to be set to zero and extremely large positions to be regulated, resulting in sparse and stable portfolios. Besides, DeMiguel et al. 42 showed that the global minimum variance portfolio can be generated through the portfolio regularization term. These studies motivate our model, since they give theoretical support that this regularization contributes to the generalization of the portfolio model. We therefore apply the elastic net to $b$, that is, a combination of the L1 norm and the squared L2 norm of the portfolio vector with tuning parameters, as a regularization term of the proposed model. Since $b$ is restricted to the simplex, the L1 norm of $b$ equals the constant 1. Therefore, we only need to minimize $\frac{\tau}{2}\|b\|_2^2$ to control the sparsity and stability of our model, in which $\tau$ is the regularization parameter controlling the generalization of the model. Based on the above discussion, we propose optimization model (13).
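The three terms discussed above can be written down as a short Python function. This is a sketch under stated assumptions: the elastic net on $b - \hat{b}_t$ is taken as $\lambda\|\cdot\|_1 + \frac{\eta}{2}\|\cdot\|_2^2$ and the portfolio term as $\frac{\tau}{2}\|b\|_2^2$. The $\frac{\tau}{2}$ factor appears in the text, while the exact coefficients of the other terms are an assumption, as are the function name and the sample numbers.

```python
import numpy as np

def denrpo_objective(b, f, b_hat, lam, eta, tau):
    """Assumed form of the objective:
    -f'b + lam*||b - b_hat||_1 + (eta/2)*||b - b_hat||_2^2 + (tau/2)*||b||_2^2."""
    b = np.asarray(b, dtype=float)
    diff = b - np.asarray(b_hat, dtype=float)
    expected_return = np.asarray(f, dtype=float) @ b
    cost_term = lam * np.abs(diff).sum() + 0.5 * eta * (diff @ diff)
    portfolio_term = 0.5 * tau * (b @ b)
    return -expected_return + cost_term + portfolio_term

# Hypothetical two-asset example: staying at b_hat incurs no cost term.
f = np.array([1.1, 0.9])
b_hat = np.array([0.5, 0.5])
stay = denrpo_objective(b_hat, f, b_hat, lam=0.05, eta=0.00025, tau=0.00005)
move = denrpo_objective([1.0, 0.0], f, b_hat, lam=0.05, eta=0.00025, tau=0.00005)
```

Comparing `stay` and `move` shows the trade-off the model encodes: moving all wealth to the first asset raises the predicted return by 0.1 but pays an L1 penalty of $\lambda \cdot 1 = 0.05$ plus small quadratic terms.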
The next theorem establishes the existence of solutions to the problem (13).
Proof It is easy to see that the objective function in model (13) is continuous and that the constraint set $\Delta_m$ is closed and bounded. Thus problem (13) has an optimal solution. If $\eta > 0$ or $\tau > 0$, the objective function is strongly convex, and therefore problem (13) has a unique optimal solution.

Solving algorithms
In this section, we develop algorithms to solve problem (13) efficiently.

Linearized augmented Lagrangian method
The augmented Lagrangian method (ALM) is an efficient algorithm for linear equality constrained optimization problems, so we apply it to the proposed model (13). Since the nonlinear term in the model makes the subproblem of the general ALM harder to solve, we develop a linearized augmented Lagrangian method (LALM), which linearizes the quadratic term of the ALM, to solve the problem more efficiently. We first introduce the indicator function of $\mathbb{R}^m_+$, which equals 0 on $\mathbb{R}^m_+$ and $+\infty$ elsewhere. With it, the augmented Lagrangian function of problem (13) is formed with a Lagrange multiplier $\xi \in \mathbb{R}$ and a penalty parameter $\rho > 0$. The LALM then updates $b^{k+1}$ and $\xi^{k+1}$ in each iteration, using a parameter $\alpha$ that will be described below.
The following lemma shows that problem (16) can be computed easily. Specifically, the closed-form solution of problem (16) involves only the soft-thresholding operator and the projection onto $\mathbb{R}^m_+$.

Lemma 1 The optimization problem (16) has a closed-form solution.
Proof The solution is derived from (16) and is expressed with $P_T$, the projection onto the set $T$, and $q = w - \hat{b}_t$.
The above iterative update process is summarized in Algorithm 1. We point out that our proposed algorithm does not need to compute the projection onto the simplex, which makes it computationally efficient.
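The two building blocks named for the closed form of the LALM subproblem, soft-thresholding and projection onto $\mathbb{R}^m_+$, can be sketched in Python as follows. How the paper composes them in Lemma 1 is not reproduced here; the function names and sample vector are hypothetical.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Soft-thresholding: shrink each entry toward zero by kappa,
    setting entries with |v_i| <= kappa to zero."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def project_nonnegative(v):
    """Euclidean projection onto the nonnegative orthant R^m_+."""
    return np.maximum(np.asarray(v, dtype=float), 0.0)

v = np.array([3.0, -0.5, -2.0])
shrunk = soft_threshold(v, 1.0)
clipped = project_nonnegative(v)
```

Both operations are element-wise and cost $O(m)$, whereas a projection onto the simplex typically needs a sort.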
It can be proved that the LALM can be equivalently reformulated as the Chambolle-Pock algorithm 43. Thus, we can easily obtain the following theorem regarding the convergence of the proposed algorithm.
Theorem 2 Let $\{b^k : k \in \mathbb{N}\}$ be generated by Algorithm 1. Then there exists a $b^* \in \mathbb{R}^m$ such that $b^k \to b^*$ as $k \to \infty$, and $b^*$ is an optimal solution of problem (13).
In Algorithm 1, Theorem 2 holds when the parameters satisfy $\alpha < \frac{1}{\rho m}$ and $\rho > 0$, where $m$ is the number of assets. Theorem 2 indicates that our portfolio update algorithm outputs an optimal solution of model (13), with theoretical support.

Alternating direction method of multipliers (ADMM)
From the above discussion, the augmented Lagrangian function of problem (13) has a quadratic term $\frac{\rho}{2}\left(\mathbf{1}^{\top} b - 1\right)^2$, which increases the computational difficulty; this is why we linearize the ALM. To treat this issue differently, we apply ADMM, which introduces an auxiliary variable so that the variables can be updated alternately while the Lagrange multiplier is still updated by gradient ascent, and thus avoids the tedious handling of the quadratic term. Specifically, we introduce an auxiliary variable $d \in \mathbb{R}^m$ to approximate $b$, and decompose the update of $b$ into a quadratic minimization problem for $b$ and a soft-thresholding problem for $d$, as illustrated below. Moreover, since ADMM requires no linearization, it can achieve higher accuracy, and it keeps the solution in the simplex by projecting $b_{t+1}$ onto the simplex to form an eligible portfolio, following Duchi et al. 44.
In addition, not all formulations of ADMM have saddle points, and few works take the trouble to establish the existence of a saddle point. We prove that the augmented Lagrangian function based on the proposed model (13) has a saddle point, which justifies the iterative formulae of ADMM. Next we describe how ADMM is applied to model (13).
We first reformulate problem (13) as the minimization of $g_1(b) + g_2(d)$ subject to $b = d$. In this way, ADMM can be applied to problem (13) through its augmented Lagrangian function (23), and it generates a new iterate $(b^{k+1}, d^{k+1}, y^{k+1})$ by the updates (24). The following lemma gives the closed-form solutions to these subproblems.

Lemma 2
The subproblems for $b$ and $d$ in the optimization problem (24) have closed-form solutions, expressed with $D = \frac{1}{\eta+\rho}\left(y^k + \rho b^{k+1} - \rho \hat{b}_t\right)$ and $P_{\Delta_m}$, the projection onto the simplex $\Delta_m$.
Proof For the iteration of $b$ in problem (24), the update reduces to a projection onto the simplex. For the iteration of $d$ in problem (24), let $w = d - \hat{b}_t$; then formula (26) becomes a soft-thresholding problem in $w$ with $D = \frac{1}{\eta+\rho}\left(y^k + \rho b^{k+1} - \rho \hat{b}_t\right)$, from which the result follows. ADMM for solving problem (13) is summarized in Algorithm 2.
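The following Python sketch shows one way the ADMM iteration described above could look. It is a reconstruction under assumptions, not the paper's Algorithm 2: the objective is taken as $-f^{\top}b + \lambda\|b-\hat{b}_t\|_1 + \frac{\eta}{2}\|b-\hat{b}_t\|_2^2 + \frac{\tau}{2}\|b\|_2^2$, the coupling term as $y^{\top}(b-d) + \frac{\rho}{2}\|b-d\|_2^2$, and the $b$-update is derived from that choice. The $d$-update uses the quantity $D$ given in Lemma 2. The sort-based simplex projection, the stopping rule and all names are illustrative.

```python
import numpy as np

def soft_threshold(v, kappa):
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def project_simplex(v):
    """Sort-based Euclidean projection onto {b >= 0, sum(b) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    last = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[last] / (last + 1)
    return np.maximum(v - theta, 0.0)

def admm_sketch(f, b_hat, lam, eta, tau, rho=0.618, tol=1e-8, max_iter=10000):
    """ADMM on the split b = d (assumed objective, see lead-in)."""
    b, d = b_hat.copy(), b_hat.copy()
    y = np.zeros_like(b_hat)
    for _ in range(max_iter):
        # b-step: quadratic in b, solved by projecting onto the simplex.
        b = project_simplex((f - y + rho * d) / (tau + rho))
        # d-step: soft-thresholding of D, shifted back by b_hat.
        d_old = d
        D = (y + rho * b - rho * b_hat) / (eta + rho)
        d = b_hat + soft_threshold(D, lam / (eta + rho))
        # Multiplier update by gradient ascent.
        y = y + rho * (b - d)
        # Stop when primal and dual residuals are both small.
        if np.linalg.norm(b - d) < tol and rho * np.linalg.norm(d - d_old) < tol:
            break
    return b

# Hypothetical inputs: predicted relatives f and drifted portfolio b_hat.
f = np.array([1.05, 0.98, 1.01])
b_hat = np.array([0.2, 0.5, 0.3])
b_next = admm_sketch(f, b_hat, lam=0.05, eta=0.00025, tau=0.00005)
```

With these numbers the largest predicted gain from shifting wealth between two assets is 0.07 per unit, which does not cover the L1 cost of $2\lambda = 0.1$ per unit, so the sketch keeps the drifted portfolio. This is the no-trade behaviour that the L1 term is meant to produce.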
We next establish the convergence of ADMM applied to problem (13) by showing the existence of a saddle point of the Lagrangian function (23). We state this result in the following proposition.
Proof By Theorem 1, let $b^*$ be an optimal solution of problem (13). Fermat's rule then gives $0 \in \partial(g_1 + g_2)(b^*)$. The convexity of $g_1$ and $g_2$ follows easily, as in Theorem 1. Since $g_1$ and $g_2$ are convex, $(b^*, d^*)$ minimizes $L(b, d, y^*)$, which implies that $L(b^*, d^*, y^*) \le L(b, d, y^*)$ for all $b, d \in \mathbb{R}^m$. Second, the proposed model (13) is strongly convex if the regularization parameters satisfy $\eta > 0$ or $\tau > 0$. By Slater's theorem 45, strong duality holds, which guarantees that $L(b^*, d^*, y) \le L(b^*, d^*, y^*)$ for any $y \in \mathbb{R}^m$, as $b^* = d^*$. Combining (29) and (30), we complete the proof.
To complete the analysis of ADMM applied to the proposed model (13), we establish the convergence result in the following theorem, which is a direct consequence of Proposition 3 and Proposition 5.4.1 in 46.
Theorem 4 Let $\{(b^k, d^k, y^k) : k \in \mathbb{N}\}$ be generated by Algorithm 2. Then $\{(b^k, d^k, y^k)\}$ is a convergent sequence, $\{b^k - d^k\}$ converges to 0, and $\{b^k\}$ converges to an optimal solution of problem (13).

Experiment

Data-sets
We compare the performance of DENRPO and other strategies on five datasets: NYSE (O) 9, NYSE (N) 12,21, TSE 47, MSCI 35 and DJIA. These datasets collect historical relative price information, where the element in the $i$th row and $j$th column denotes the relative price of the $j$th asset in the $i$th period. NYSE (O) and NYSE (N) were collected from the New York Stock Exchange: NYSE (O) contains 36 stocks ranging from 7 March 1962 to 31 December 1984, and NYSE (N) contains 23 stocks that survived until 30 June 2010. TSE comes from the Toronto Stock Exchange and contains 88 stocks ranging from 4 January 1994 to 31 December 1998. MSCI contains 24 indices that represent the equity markets of 24 countries around the world, ranging from 1 April 2006 to 31 March 2010. The final dataset, DJIA, collects the 30 stocks of the Dow Jones Industrial index over the whole of 2010. The first four datasets mainly test the performance of the algorithms in the stock market, and the last dataset is used to test the algorithms in long-short transactions. These datasets are publicly available and come from the real market, so evaluating the proposed optimization model on them is effective and comparable.

Parameter setting
In the proposed model, there are three regularization parameters, namely $\lambda$, $\eta$ and $\tau$, and four algorithmic parameters, $\xi$, $y$, $\alpha$ and $\rho$. Among them, $\alpha$ is an inner variable, $\xi$ is updated by dual ascent in LALM, and $y$ is the Lagrange multiplier updated by ADMM. These variables therefore do not affect the performance of the methods, and we do not discuss them. To control the iteration, we set the tolerance $\epsilon = 10^{-8}$ and max_iteration $= 10^{8}$. We take $\alpha = \frac{0.999}{\rho m}$ and $\rho = 0.618$, and discuss $\lambda$, $\eta$ and $\tau$ below to control the regularization.
We determine the value of each parameter by fixing two parameters and varying the third. Let $\gamma$ denote the transaction cost rate. For $\lambda$, referring to $\lambda = 10\gamma$ in the TCO framework 31, we fix $\eta = \tau = 0$ and search around $\lambda = 10\gamma$. Our experiments show that the cumulative wealth is relatively high for the same running time when $\lambda = 10\gamma$, so we take $\lambda = 10\gamma$. For $\eta$ and $\tau$, we again fix one at 0 and vary the other. We then combine the values of $\eta$ and $\tau$ that performed better in this step and select the combination giving the higher cumulative wealth for the same running time. For simplicity, Table 1 reports the parameter tuning results of DENRPO1-OLMAR and Table 2 those of DENRPO2-OLMAR at $\gamma = 0.5\%$; the parameter values can be understood in terms of the convergence speed of the solving algorithms. The experimental results show that $\eta = 0.00025$ and $\tau = 0.00005$ give a relatively strong overall performance, so we apply these two values in all experiments.
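The settings reported in this section can be collected in a small Python helper, shown here as an illustrative sketch. The function name and dictionary keys are hypothetical; the values are the ones stated above.

```python
def denrpo_settings(gamma, m, rho=0.618):
    """Collect the reported settings for transaction cost rate gamma
    and number of assets m."""
    return {
        "lam": 10.0 * gamma,          # lambda = 10 * gamma, as in the TCO framework
        "eta": 0.00025,
        "tau": 0.00005,
        "rho": rho,
        "alpha": 0.999 / (rho * m),   # slightly below the bound 1 / (rho * m)
        "tol": 1e-8,
        "max_iter": 10**8,
    }

# Example: gamma = 0.5% on a 36-asset dataset such as NYSE (O).
settings = denrpo_settings(gamma=0.005, m=36)
```

Choosing $\alpha$ as 0.999 times the bound keeps the step as large as the convergence condition $\alpha < \frac{1}{\rho m}$ allows.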

Comparison approaches
We employ the DENRPO method to solve the online portfolio selection problem on the above four benchmark data sets. For comparison, 12 other online portfolio selection algorithms are also run in our experiments. Specifically, UBAH, BEST and BCRP are three benchmark approaches, of which UBAH reflects the stock price trend of the real financial market. SSPO, S1, S2 and S3 are sparse strategies based on short-term investment. TCO1 and TCO2 are strong approaches that consider transaction costs; furthermore, TCO-RMR and TCO-GLR use the RMR and GLR price predictions, respectively, on top of the transaction cost optimization framework. WFDA is a portfolio strategy for long-short transactions that relies on wavelet feature engineering. The details of the algorithms and their parameter values, which are taken from the original papers or derived from numerical experiments based on them, are listed below:

Cumulative wealth with fixed transaction costs
For simplicity, we fix the transaction cost rate and show the daily cumulative return trend of the proposed algorithms on NYSE(O) and MSCI; observing wealth growth under fixed transaction costs makes it easier to evaluate the algorithms. Figures 1 and 2 show the daily cumulative return trend of the proposed algorithms compared with the TCO strategies at a fixed transaction cost rate of 0.25%. The wealth growth patterns of DENRPO and TCO are roughly the same, but the daily return of DENRPO is slightly higher than that of TCO in most cases. Thus, after investing for a period, DENRPO obtains higher cumulative wealth, which demonstrates the superiority and practicality of the proposed method.

Cumulative wealth with varying transaction costs
To better show the effectiveness of the introduced elastic net terms under non-zero transaction costs, and to analyze the trend of the cumulative wealth as transaction costs change, Figs. 3 and 4 as well as Table 3 compare the cumulative wealth achieved by the proposed DENRPO strategies and the other methods listed above.
We can draw several observations from them. First, in Figs. 3 and 4 the cumulative wealth obtained by the three benchmark algorithms is almost a straight line as transaction costs increase, indicating that these benchmarks are little affected by transaction costs. UBAH and BEST do not rebalance the portfolio during the trading period and thus incur no rebalancing cost, while BCRP rebalances to a fixed portfolio every day and is therefore less affected by transaction costs. For the same reasons, they obtain less wealth.
Moreover, SSPO, S1, S2 and S3 perform better when the transaction cost is 0 in Table 3, but their performance drops sharply under non-zero transaction costs. On most datasets, their cumulative wealth is close to 0 when the transaction cost rate is only about 0.5%. The reason is that short-term investment aims at a high return in a short time and thus requires frequent transactions, which generate a large amount of transaction costs. Hence, transaction costs have a greater impact on short-term investment, and cumulative wealth decreases rapidly as transaction costs increase.
In addition, comparing TCO and DENRPO, both of which are transaction cost optimization strategies, in Figs. 3 and 4, the cumulative wealth of DENRPO is significantly higher than that of TCO in most cases. As transaction costs increase, the DENRPO strategy also achieves two small peaks of cumulative wealth on TSE between about $\gamma = 0.3\%$ and $\gamma = 0.7\%$, indicating that it is able to counteract transaction costs, which supports the stability and the better out-of-sample performance of the proposed method. Further comparing TCO and DENRPO in Table 3, DENRPO obtains 8.02E+06, 893.22, 7.84 and 1.30 on NYSE(O), NYSE(N), TSE and MSCI at $\gamma = 0.5\%$, respectively, which indicates that DENRPO survives better

Mean excess return
To measure the daily return performance of each algorithm, we first need to know the proportion of total wealth gained or lost on each day. Because of transaction costs, we represent this quantity as a term related to the net proportion of wealth, which can be understood as the net proportion of wealth gained or lost. Mean excess return (MER) 48 is defined in this paper as the average of the daily excess returns over the UBAH strategy, computed from $r_{s,t}$ and $r_{m,t}$, the daily excess returns of the compared portfolio strategy and of UBAH on the $t$th day, respectively.
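As a sketch of the metric just defined, the Python function below takes MER to be the mean over $T$ days of $r_{s,t} - r_{m,t}$. This reading of the verbal definition is an assumption of the example, and the function name and the return series are made up.

```python
import numpy as np

def mean_excess_return(r_s, r_m):
    """Average daily difference between strategy returns r_s
    and benchmark (UBAH) returns r_m."""
    r_s = np.asarray(r_s, dtype=float)
    r_m = np.asarray(r_m, dtype=float)
    return float(np.mean(r_s - r_m))

# Hypothetical daily returns over three days.
r_s = [0.010, 0.020, -0.010]
r_m = [0.005, 0.010, 0.000]
mer = mean_excess_return(r_s, r_m)
```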
By the definition of MER, a superior portfolio strategy should have a larger MER value: the larger the MER, the better the performance of the strategy. Even a small gap in MER can indicate a large difference between portfolio strategies, especially for long-term investments 22. Table 4 presents the MERs of our proposed methods and the TCO strategy, both of which consider the transaction cost.
DENRPO performs the best in most cases, since it usually attains the largest MER. For example, its MERs are 0.0069, 0.0023, 0.0060 and 0.0017 at $\gamma = 0.25\%$ on NYSE(O), NYSE(N), TSE and MSCI, respectively, values that even many strategies that do not take transaction costs into account cannot achieve. This explains why DENRPO outperforms the other systems in cumulative wealth.

α Factor
We evaluate whether our proposed method outperforms the benchmark and the TCO method, which considers transaction costs, with statistical significance. The Capital Asset Pricing Model (CAPM) 49 states that the intrinsic excess return forms part of the expected return; this part is usually called the α Factor in the finance industry 50. An excellent portfolio strategy improves the α Factor, which is computed using $\hat{c}(\cdot, \cdot)$ and $\sigma(\cdot, \cdot)$, the sample covariance and the sample standard deviation (STD) over $T$ trading days, respectively. Table 5 reports the α Factor of our proposed method compared with the benchmark and TCO methods. DENRPO achieves 0.0067, 0.0027 and 0.0057 at $\gamma = 0.25\%$ on NYSE(O), NYSE(N) and TSE, respectively, which are much higher than those of TCO. This further supports the better performance of the proposed methods.
β Factor

In addition to measuring returns, we also need to measure risk in order to evaluate a strategy more comprehensively. The β Factor is a commonly used risk indicator that measures the volatility of the strategy return with respect to the market benchmark. The case $0 < \beta < 1$ shows that the strategy return is positively correlated with the market return, and the smaller β is, the less the strategy return fluctuates relative to the market return. The calculation of the β Factor is given in (33), and Table 6 shows the results of the proposed method compared with the benchmark and TCO methods. DENRPO obtains 1.0851, 0.9954 and 1.2146 at $\gamma = 0.5\%$ on NYSE(O), NYSE(N) and TSE, respectively, which are much smaller than those of TCO, indicating that the proposed method remains stable as transaction costs increase. This test supports the generalization ability of DENRPO.
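For readers who want to reproduce α and β on their own return series, the sketch below uses the conventional CAPM regression estimates, $\beta = \hat{c}(r_s, r_m)/\sigma^2(r_m)$ and $\alpha = \bar{r}_s - \beta \bar{r}_m$. These textbook forms are an assumption of this example and may differ in detail from the expressions used in the paper; the function name and data are hypothetical.

```python
import numpy as np

def capm_alpha_beta(r_s, r_m):
    """Conventional CAPM estimates from strategy returns r_s and
    market benchmark returns r_m (assumed forms, see lead-in)."""
    r_s = np.asarray(r_s, dtype=float)
    r_m = np.asarray(r_m, dtype=float)
    cov = np.cov(r_s, r_m, ddof=1)        # 2x2 sample covariance matrix
    beta = cov[0, 1] / cov[1, 1]          # covariance over market variance
    alpha = r_s.mean() - beta * r_m.mean()
    return float(alpha), float(beta)

# Hypothetical series: the strategy moves twice as much as the market,
# plus a constant daily excess of 0.001.
r_m = np.array([0.010, -0.020, 0.015, 0.000])
r_s = 2.0 * r_m + 0.001
alpha, beta = capm_alpha_beta(r_s, r_m)
```

In this constructed example the estimates recover the slope 2 and the constant 0.001, which is a quick sanity check before applying the function to real data.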

Sharpe ratio
To illustrate the empirical superiority of DENRPO more comprehensively, we compare it with the WFDA strategy in terms of the Sharpe ratio in the long-short baseline. The Sharpe ratio measures the excess return per unit of risk, so a higher ratio means better performance of the strategy; it is computed as in (35). We compare against the long-short baseline setting that simultaneously buys the top five and shorts the bottom five stocks sorted by VaR and CVaR in an hour and holds the position for one day on the DJIA dataset. Table 7 shows the Sharpe ratio of DENRPO, the raw long-short baseline and the WFDA-based long-short transaction, where α denotes the confidence level, and raw and WFDA denote VaR and CVaR computed from raw data and WFDA-processed data, respectively. The Sharpe ratio of DENRPO is the highest among all compared strategies, indicating its effectiveness and superiority in the real market.

Conclusion
In this paper, we study the online portfolio selection problem with transaction costs via machine learning. First, we formulate the problem as a minimization problem over the simplex. By minimizing the negative expected return and applying elastic net regularization to the transaction cost control term and to the portfolio variable, we construct a practical and robust model that aims to maximize return while minimizing transaction costs. Since the augmented Lagrangian function of the proposed model has a quadratic term, we apply LALM and ADMM to solve the model, which reduces the computational difficulty. Further, we theoretically guarantee that the sequences generated by the proposed algorithms converge to a solution of the proposed model, and we establish closed-form solutions of the subproblems in each iteration. Moreover, we compare with state-of-the-art portfolio algorithms on five commonly used benchmark datasets. Extensive numerical experiments demonstrate that the cumulative wealth obtained by the proposed algorithms exceeds that of the compared algorithms in most cases as transaction costs increase, and that the proposed method also excels in long-short transactions.

$\text{Sharpe ratio} = \frac{\bar{r}_s - r_m}{\sigma^2(r_s)}$. (35)
Table 6. β Factor obtained by several algorithms on four data-sets with transaction costs. Top two achievements in each column excluding benchmark are highlighted.
where $\odot$ denotes the element-wise product. In contrast, BEST invests all wealth in the asset that performs best in hindsight and keeps it unchanged: $b_1 = \arg\max_{b \in \Delta_m} b^{\top}\left(\bigodot_{t=1}^{T} x_t\right)$. Besides, BCRP updates the portfolio to $b^* = \arg\max_{b \in \Delta_m} \sum_{t=1}^{T} \log\left(b^{\top} x_t\right)$ in each period, where $b^*$ denotes the portfolio maximizing return in hindsight. Therefore, the cumulative wealth of BCRP is $S_T = \prod_{t=1}^{T} b^{*\top} x_t$.
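The benchmark wealth formulas can be sketched in Python as follows, ignoring transaction costs and normalizing initial wealth to 1, both of which are simplifying assumptions of this example. The BCRP optimization itself is not solved here; `crp_wealth` only evaluates the wealth of a given constant rebalanced portfolio. Function names and data are hypothetical.

```python
import numpy as np

def ubah_wealth(X):
    """Uniform buy-and-hold: equal initial weights applied to the
    element-wise product of price relatives over all periods."""
    X = np.asarray(X, dtype=float)            # shape (T, m)
    m = X.shape[1]
    return float(np.full(m, 1.0 / m) @ X.prod(axis=0))

def best_wealth(X):
    """Best single asset in hindsight, held for the whole horizon."""
    return float(np.asarray(X, dtype=float).prod(axis=0).max())

def crp_wealth(X, b):
    """Wealth of a constant rebalanced portfolio b: product of b'x_t."""
    X = np.asarray(X, dtype=float)
    return float(np.prod(X @ np.asarray(b, dtype=float)))

# Hypothetical price relatives: two periods, two assets.
X = np.array([[1.1, 0.9],
              [1.2, 1.0]])
w_ubah = ubah_wealth(X)
w_best = best_wealth(X)
w_crp = crp_wealth(X, [0.5, 0.5])
```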

we have $0 \in \partial g_1(b^*) + \partial g_2(b^*)$. Thus, there exists a corresponding $y^* \in \mathbb{R}^m$. Set $d^* = b^*$. We next show that $(b^*, d^*, y^*)$ is a saddle point of $L$. First, taking the partial derivative of $L(b, d, y^*)$ with respect to $b$ at $b^*$ and with respect to $d$ at $d^*$, we find that $(b^*, d^*)$ is a minimizer of $L(b, d, y^*)$ due to the convexity of $L(b, d, y^*)$.

Figure 1. Log daily return obtained by DENRPO1s and TCOs with $\gamma = 0.25\%$ on NYSE(O) and MSCI.

Figure 2. Log daily return obtained by DENRPO2s and TCOs with $\gamma = 0.25\%$ on NYSE(O) and MSCI.

Figure 3. Cumulative wealth obtained by DENRPO1s compared with the cumulative wealth obtained by the listed algorithms under varying transaction costs on the four data-sets.

Figure 4. Cumulative wealth obtained by DENRPO2s compared with the cumulative wealth obtained by the listed algorithms under varying transaction costs on the four data-sets.

Table 1. Cumulative wealth achieved by DENRPO1-OLMAR for different combinations of η and τ with a transaction cost rate of 0.5%.

$= \frac{\hat{c}(r_s, r_m)}{\sigma^2(r_s, r_m)},$

Table 3. Cumulative wealth obtained by various algorithms on the four data-sets with transaction costs. Top two achievements in each column excluding benchmarks are highlighted.

Table 4. Mean excess return obtained by several algorithms on the four data-sets with transaction costs. Top two achievements in each column excluding benchmark are highlighted.

Table 7. Sharpe ratio obtained by DENRPO and WFDA on the DJIA dataset in long-short transactions. Top two achievements of strategies are highlighted.